New logger connector code #7882
Conversation
Hello @carmocca! Thanks for updating this PR.
Comment last updated at 2021-06-08 19:35:08 UTC
Codecov Report
@@            Coverage Diff            @@
##           master    #7882     +/-   ##
=========================================
- Coverage      92%      86%       -6%
=========================================
  Files         202      204        +2
  Lines       13159    13630      +471
=========================================
- Hits        12155    11788      -367
- Misses       1004     1842      +838
# reset result collection for next epoch
self.trainer.result_collection.reset(metrics=True)

def teardown(self):
Is that something we should do in the loops maybe?
Just saying, no change requested.
The trainer will call this:
def _post_dispatch(self):
    self.accelerator.post_dispatch(self)
    self.accelerator.teardown()
    self.logger_connector.teardown()
pytorch_lightning/trainer/connectors/logger_connector/logger_connector_new.py
very nice!
pytorch_lightning/trainer/connectors/logger_connector/logger_connector_new.py
LGTM! Great job @carmocca
Evaluation metric updates
"""

def prepare_eval_loop_results(self, metrics: Mapping[str, _METRIC]) -> None:
Could be moved to the utils section.
Why? It's tied to evaluation
pytorch_lightning/trainer/connectors/logger_connector/logger_connector_new.py
self.trainer.train_loop.results.cpu()
self.trainer.evaluation_loop._val_results.cpu()
self.trainer.evaluation_loop._test_results.cpu()
@awaelchli do you think each loop should run this in its own teardown, instead of the logger connector doing it?
These results are the ResultCollection, right? I think yes: if the loops own this results collection, they should handle the teardown.
Okay. Do you think I should make that change here, or will you do it once it's all merged?
Yeah we shouldn't hold this back. Added a TODO to our POC.
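To make that TODO concrete, here is a rough sketch of the direction discussed above (purely hypothetical, not code from this PR), where each loop owns its result collection and moves it to CPU in its own teardown rather than relying on the logger connector:

# Hypothetical sketch: the loop owns the ResultCollection it logs into and
# releases GPU memory itself when it finishes, instead of the LoggerConnector.
class TrainLoop:
    def __init__(self, results) -> None:
        # `results` is assumed to be the ResultCollection this loop logs into
        self.results = results

    def teardown(self) -> None:
        # move logged tensors off the GPU once the loop is done
        self.results.cpu()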
elif step is None:
    # added metrics by Lightning for convenience
    scalar_metrics['epoch'] = self.trainer.current_epoch
    step = self.trainer.global_step
There are a lot of Codecov warnings.
This is expected, considering we haven't integrated the loops to use these new files yet.
You can take a peek at #7631 for the full integration.
This PR includes a few unit tests for the code in `Result`, but `LoggerConnectorNew` is completely unused.
Nice! LGTM 😃 minor comment
pytorch_lightning/trainer/connectors/logger_connector/logger_connector_new.py
LGTM, the integration with torchmetrics seems nice and definitely clears up some of the older logging code :]
pytorch_lightning/trainer/connectors/logger_connector/logger_connector_new.py
# TODO(@carmocca): Remove `New` suffix
class LoggerConnectorNew:
Super nit: for renamings I've always been a fan of `_v2` instead of `New`, for the worst-case circumstance that there are 3 or more versions around at the same time.
Thanks for the tip! Will keep it in mind for future refactors
# added metrics by Lightning for convenience
scalar_metrics['epoch'] = self.trainer.current_epoch
step = self.trainer.global_step
n00b questions:
- why does `step` indicate we should add `epoch` to the scalar metrics?
- what happens if `epoch` was already in the scalar metrics as another value? e.g. if the user does `self.log("epoch", ...)`?
why does `step` indicate we should add `epoch` to the scalar metrics?
Didn't get this, can you rephrase?
what happens if `epoch` was already in the scalar metrics as another value?
Well, as you see here, it gets overwritten. It wouldn't be a bad idea to check whether it's there already. Will do in a future PR.
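As an illustration of that future check (a minimal sketch, not part of this PR; the helper name and signature are made up), it could warn before the convenience `epoch` key overwrites a user-logged value:

import warnings

def add_convenience_metrics(scalar_metrics: dict, current_epoch: int, global_step: int, step=None):
    # hypothetical helper mirroring the snippet above
    if step is None:
        if 'epoch' in scalar_metrics:
            warnings.warn("'epoch' was already logged by the user and will be overwritten by the trainer")
        # added metrics by Lightning for convenience
        scalar_metrics['epoch'] = current_epoch
        step = global_step
    return scalar_metrics, step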
class MetricSource(LightningEnum):
    CALLBACK = "callback"
    PBAR = "pbar"
    LOG = "log"
Are these metric sources, or are they destinations?
In the future, I can imagine we want to configure multiple loggers but want to log metrics to only some of them. How could this be extended for that use case?
Are these metric sources, or are they destinations?
Mmm, it depends on how you think about it. Do you `self.log(prog_bar=True)` into the progress bar metrics (destination), or are the progress bar metrics gathered from the `prog_bar=True` logged values (source)?
If you have an idea for a less confusing name, we can rename without issues 👍
In the future, I can imagine we want to configure multiple loggers but want to log metrics to only some of them. How could this be extended for that use case?
In that case, I think it's easier for us to filter at the logger level. `_Metadata` could have a field for the target logger.
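To sketch that extension (hypothetical, not in this PR; the `target_loggers` field and the `should_log_to` helper are made up for illustration, and the real `_Metadata` has more fields):

from dataclasses import dataclass
from typing import Optional, Set

@dataclass
class _Metadata:
    fx: str
    name: str
    prog_bar: bool = False
    logger: bool = True
    # hypothetical field: restrict the metric to specific loggers (None means all of them)
    target_loggers: Optional[Set[str]] = None

def should_log_to(meta: _Metadata, logger_name: str) -> bool:
    # a logger receives the metric if logging is enabled and it isn't filtered out
    return meta.logger and (meta.target_loggers is None or logger_name in meta.target_loggers)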
def check_fn(v):
    if v.grad_fn is not None:
        raise MisconfigurationException(f'You returned a tensor with `grad_fn`. The extra values are {extra}')
Could you explain what this checks for? Is it a misconfiguration for users to return tensors with gradients enabled?
IIRC the previous implementation detached extras silently; however, I feel it's better for users to know they are passing tensors with graphs attached. This helps catch potential errors on the user side, and it's not much to ask the user to detach when necessary.
Do you think this constraint should be relaxed?
Not entirely sure - just noticed that quite a few of our users were relying on this (silent) behavior and will see exceptions with the latest version of the code here.
Could you share an example of what users should do now? Is the idea that they want to call `detach` on non-loss tensors?
Is the idea that they want to call `detach` on non-loss tensors?
Exactly:
def training_step(...):
    ...
    # `something_else` is considered an "extra"
    return {'loss': loss, 'something_else': something_else.detach()}
…ter) to github/third-party/PyTorchLightning/pytorch-lightning

Summary:

## OSS
Note these issues are being solved in OSS here: https://github.com/PyTorchLightning/pytorch-lightning/pull/7994/files#

## Manual
- `speed_monitor.py` - `Result.unpack_batch_size` has been removed, moved to new implementation.
- `fully_sharded.py` - There was a refactor for plugins, so updated corresponding function to keep reduced memory usage.
- `hive_writing_classy.py`, `hive_writing_faim.py`, `hive_writing_xrayvideo.py` - Same as `speed_monitor.py`.
- [Temporary] Uncommented misconfiguration exception. See Lightning-AI/pytorch-lightning#7882 (review).
- Update `TestModel` to detach appropriately.
- Manually `detach` metrics stored in ResultStore.

## Automatic
### New commit log messages
f7459f53 DeepSpeed Infinity Update (#7234)
03e7bdf8 Improve `LightningModule` hook tests (#7944)
3a0ed02b Properly handle parent modules w/ parameters in `BaseFinetuning` callback (#7931)
ce93d8bc Handle errors due to uninitailized parameters (#7642)
cca0e753 remove parsing comments (#7958)
898fb56b added on_test_start() documentation (#7962)
22d82661 Seed all workers when using DDP (#7942)
436fc53c Improve `LightningDataModule` hook test and fix `dataloader_idx` argument (#7941)
6b7b4047 deprecate hpc_load() and integrate it with restore() (#7955)
20a5e09e fix myst-parser warning blocking docs ci (#7967)
f15ea601 update chlog + legacy chpt (#7954)
59d0c656 Add dataclass support to `apply_to_collection` (#7935)
cdd01f32 LightningCLI support for argument links applied on instantiation (#7895)
6856cced Remove rank_zero_only on DataModule prepare_data (#7945)
96433d03 IPU Integration 5/5 (#7867)
42c7f272 refactor checkpoint loading for training type plugins (#7928)
ac4eb0a0 `is_overridden` improvements (#7918)
9e932f4d Delete `on_after_backward` unused argument (#7925)
8b738693 Deprecate the default `EarlyStopping` callback monitor value (#7907)
c1eac483 split `restore_training_state` into logical parts [2 / 2] (#7900)
d209b689 split `restore_training_state` into logical parts [1 / 2] (#7901)
111287b4 add pre-commit hooks (#7906)
839019a3 Remove legacy teardown check in train loop (#7917)
b45a89a2 Clean-up after logger connector redesign 2/2 (#7631)
07b69231 Remove fn check for ipu output (#7915)
580a3b5e Remove dead code (#7910)
df812398 Clean-up after logger connector redesign 1/2 (#7909)
ec4f8856 Enable logger connector re-design (#7891)
15be9865 add logger to __all__ (#6854)
6fee9262 Deprecate `LightningDataModule` lifecycle properties (#7657)
764d2c77 refactor CheckpointConnector.restore_weights (#7862)
7f4ef6d1 Fix logs overwriting issue for remote fs (#7889)
c310ce66 Logger connector re-design `_Metadata.reduce_fx` fixes. (#7890)
b214442e New logger connector code (#7882)

Reviewed By: yifuwang
Differential Revision: D29105294
fbshipit-source-id: 990b2a4a7333908d676de193f5ec930cb50b8a19
What does this PR do?
Introduces the new code for #7631
Next PRs:
cc: @SkafteNicki please review our heavy use of `Metric` in `result_new.py`
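For context, a minimal hedged example of the kind of `torchmetrics.Metric` usage this logging code is designed around (standard `self.log` of a metric object; the internals of `result_new.py` may handle it differently, and the `Accuracy()` call follows the 2021-era torchmetrics API, where newer versions require a `task` argument):

import torch
import torchmetrics
import pytorch_lightning as pl

class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)
        # 2021-era torchmetrics API; newer versions need task="multiclass", num_classes=2
        self.train_acc = torchmetrics.Accuracy()

    def training_step(self, batch, batch_idx):
        x, y = batch
        logits = self.layer(x)
        loss = torch.nn.functional.cross_entropy(logits, y)
        # update the metric state, then log the Metric object itself so the
        # result collection can take care of compute()/reset() across the epoch
        self.train_acc(logits.softmax(dim=-1), y)
        self.log('train_acc', self.train_acc, on_step=True, on_epoch=True)
        return loss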